Metadata-Version: 2.1
Name: audio, caption crawler and processor
Version: 0.0.1
Summary: A pakage for crawling and processing audio, caption from Youtube
Home-page: https://github.com/zldzmfoq12/aud_crawler
Author: Seugnhun Jeong
Author-email: zldzmfoq12@naver.com
License: UNKNOWN
Download-URL: https://github.com/zldzmfoq12/aud_crawler/archive/0.0.tar.gz
Description: Audio, Caption Crawler and Processor
        =====================================
        
        
        ##### Downloads and processes the audios and captions(subtitles) from Youtube videos for Speech AI
        
        
        
        Requirements
        -------------
        
        * Currently requires python >= 3.6
        * FFmpeg
        
        To Use
        --------
        
              from accp import ACCP
        
              playlist_name=""
              playlist_url = ""
        
              accp = ACCP(playlist_name, playlist_url)
              accp.download_audio()    #download audio from youtube
        
              accp.download_caption()  #download captions from youtube
        
              accp.audio_split()       #split 
        
        
        Results
        ----------
           
              datasets
                |- playlist name
                    |- metadata.csv
                    |- alignment.json
                    |- wavs
                         ├── 1.wav
                         ├── 2.wav
                         ├── 3.wav
                         └── ...
          
           
           and `metadata.csv` should look like:
        
            {
                0001.wav|그래서 사람들도 날 핍이라고 불렀다.,
                0002.wav|크리스마스 덕분에 부엌에 먹을게 가득했다.,
                0003.wav|조가 자신이 그 사람이라고 나섰다.,
                ...
            }
            
           and `alignment.json` should look like:
        
            {
                "./datasets/playlist name/wavs/0001.wav": "그래서 사람들도 날 핍이라고 불렀다.",
                "./datasets/playlist name/wavs/0002.wav": "크리스마스 덕분에 부엌에 먹을게 가득했다.",
                "./datasets/playlist name/wavs/0003.wav": "조가 자신이 그 사람이라고 나섰다.",
            }
        
        
Keywords: aud_crawler
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
